Filter based Taxonomy Modification for Improving Hierarchical Classification

Authors

  • Azad Naik
  • Huzefa Rangwala
Abstract

Large-scale classification of data organized as a hierarchy of classes has received significant attention in the literature. Top-Down (TD) Hierarchical Classification (HC), which exploits the hierarchical structure during the learning process, is an effective method for dealing with problems at scale due to its computational benefits. However, its accuracy suffers from error propagation, i.e., prediction errors made at higher levels in the hierarchy cannot be corrected at lower levels. One of the main reasons behind errors at the higher levels is the presence of inconsistent nodes and links, which are introduced by the arbitrary process by which domain experts create these hierarchies. In this paper, we propose two efficient data-driven, filter-based approaches for hierarchical structure modification: (i) a Flattening (local and global) approach that identifies and removes inconsistent nodes present within the hierarchy and (ii) a Rewiring approach that modifies parent-child relationships to improve the classification performance of learned models. Our extensive empirical evaluation of the proposed approaches on several image and text datasets shows improved performance over competing approaches. Source code is available for reproducibility at: www.cs.gmu.edu/~mlbio/TaxMod
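The two structural operations named in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the child-to-parent dictionary, the function names (flatten_node, rewire, path_to_root), and the toy taxonomy are all assumptions made for illustration, and the filter criteria that decide which nodes are inconsistent are not given in the abstract, so the nodes to modify are simply passed in by hand.

```python
from typing import Dict, List, Optional

# Hypothetical representation: child -> parent map; the root maps to None.
Taxonomy = Dict[str, Optional[str]]


def flatten_node(tax: Taxonomy, node: str) -> Taxonomy:
    """Remove an internal node and attach its children to its parent."""
    parent = tax[node]
    if parent is None:
        raise ValueError("cannot flatten the root")
    # Drop `node`; any child that pointed to it now points to `parent`.
    return {c: (parent if p == node else p) for c, p in tax.items() if c != node}


def rewire(tax: Taxonomy, node: str, new_parent: str) -> Taxonomy:
    """Move `node` (with its subtree) under `new_parent`."""
    if node not in tax or new_parent not in tax:
        raise KeyError("unknown node")
    # Refuse moves that would place a node under its own descendant.
    ancestor: Optional[str] = new_parent
    while ancestor is not None:
        if ancestor == node:
            raise ValueError("rewiring would create a cycle")
        ancestor = tax[ancestor]
    new_tax = dict(tax)
    new_tax[node] = new_parent
    return new_tax


def path_to_root(tax: Taxonomy, node: str) -> List[str]:
    """Root-to-node path, i.e. the decisions a top-down classifier makes."""
    path: List[str] = []
    current: Optional[str] = node
    while current is not None:
        path.append(current)
        current = tax[current]
    return path[::-1]


# Toy taxonomy (made up): "pets" plays the inconsistent node,
# and "horse" is the misplaced child.
toy: Taxonomy = {
    "root": None,
    "animals": "root",
    "vehicles": "root",
    "pets": "animals",
    "cat": "pets",
    "dog": "pets",
    "car": "vehicles",
    "horse": "vehicles",
}

flat = flatten_node(toy, "pets")
fixed = rewire(flat, "horse", "animals")
print(path_to_root(toy, "cat"))
print(path_to_root(flat, "cat"))
print(path_to_root(fixed, "horse"))
```

In this toy setting, flattening shortens the root-to-leaf path for "cat", so a top-down classifier makes one fewer decision at which an error could propagate, while rewiring changes which branch the early decisions must select for "horse". How the paper actually scores nodes and links is left to the full text.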


Similar resources

Spectral-spatial classification of hyperspectral images by combining hierarchical and marker-based Minimum Spanning Forest algorithms

Many studies have demonstrated that spatial information can play an important role in the classification of hyperspectral imagery. This study proposes a modified spectral–spatial classification approach for improving the spectral–spatial classification of hyperspectral images. In the proposed method ten spatial/texture features, using mean, standard deviation, contrast, homogeneity, corr...


SFLA Based Gene Selection Approach for Improving Cancer Classification Accuracy

In this paper, we propose a new gene selection algorithm based on the Shuffled Frog Leaping Algorithm, called SFLA-FS. The proposed algorithm is used for improving cancer classification accuracy. Most biological datasets, such as cancer datasets, have a large number of genes and few samples. However, most of these genes are not usable in some tasks, for example in cancer classification....


Exploiting Label Dependency for Hierarchical Multi-label Classification

Hierarchical multi-label classification is a variant of traditional classification in which instances can belong to several labels that are in turn organized in a hierarchy. Existing hierarchical multi-label classification algorithms ignore possible correlations between the labels. Moreover, most of the current methods predict instance labels in a “flat” fashion without employing the ontol...


Acclimatizing Taxonomic Semantics for Hierarchical Content Classification

Hierarchical models have been shown to be effective in content classification. However, we observe through empirical study that the performance of a hierarchical model varies with given taxonomies; even a semantically sound taxonomy has potential to change its structure for better classification. By scrutinizing typical cases, we elucidate why a given semantics-based hierarchy does not work wel...


Investigate Factors Affecting on the Performance of Agricultural Machinery Companies Based on Taxonomy Algorithm

Taxonomy (general), the practice and science of classification of things or concepts, including the principles that underlie such classification. Economic taxonomy, a system of classification for economic activity. The main objective of the study was to find whether financial ratios affect the performance of the Agricultural Machinery companies in Iran. A firm performance evaluation and its comp...



Journal:
  • CoRR

Volume abs/1603.00772  Issue 

Pages  -

Publication date 2016